Using Conflicts Among Multiple Base Classifiers to Measure the Performance of Stacking

Authors

Wei Fan

Salvatore J. Stolfo

Philip K. Chan

Abstract

We analyze the machine learning bias of stacking and point out the conflict problem. Conflicts are defined as base data with different class labels for which a set of base classifiers produces the same predictions. Based on conflicts, we propose a conflict-based accuracy estimate to determine the overall accuracy of a stacked classifier and a conflict-based accuracy improvement estimate to determine the overall accuracy improvement over the base classifiers. We discuss some popular metrics for comparing and evaluating a set of classifiers (coverage, correlated error, diversity, and specialty) and show that these metrics do not accurately estimate the overall accuracy of a stacked classifier system. Our experimental results demonstrate that the conflict-based accuracy estimate is an effective measure for predicting overall performance and comparing different stacked systems, and that the conflict-based accuracy improvement estimate is a good measure for projecting the overall accuracy improvement.
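
The definition of a conflict can be illustrated with a small Python sketch. It groups examples by the tuple of base-classifier predictions and reports the groups that contain more than one true label. The function names, the toy data, and the majority-label ceiling are assumptions made for illustration; the abstract does not give the formula for the conflict-based accuracy estimate, so this should not be read as the authors' method. The ceiling relies only on the observation that a deterministic meta-classifier that sees nothing but the base predictions must give one label to every example in a group.

```python
from collections import Counter, defaultdict


def group_by_predictions(base_predictions, labels):
    """Map each tuple of base predictions to a Counter of true labels."""
    groups = defaultdict(Counter)
    for preds, label in zip(base_predictions, labels):
        groups[tuple(preds)][label] += 1
    return groups


def conflicting_groups(base_predictions, labels):
    """Return only the groups whose examples carry more than one true label."""
    groups = group_by_predictions(base_predictions, labels)
    return {key: counts for key, counts in groups.items() if len(counts) > 1}


def accuracy_ceiling(base_predictions, labels):
    """Best accuracy on this data for a deterministic meta-classifier that
    sees only the base predictions (illustrative assumption, see lead-in)."""
    if not labels:
        raise ValueError("labels must not be empty")
    groups = group_by_predictions(base_predictions, labels)
    best_correct = sum(max(counts.values()) for counts in groups.values())
    return best_correct / len(labels)


# Hypothetical toy data: two base classifiers, five examples.
toy_predictions = [("a", "a"), ("a", "a"), ("a", "b"), ("a", "b"), ("b", "b")]
toy_labels = ["a", "b", "a", "a", "b"]

print(conflicting_groups(toy_predictions, toy_labels))
print(accuracy_ceiling(toy_predictions, toy_labels))
```

In the toy data, the two examples predicted as `("a", "a")` have different true labels, so they form a conflict, and at most one of them can be classified correctly from the base predictions alone.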

Similar resources

Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm

This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with a genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on a rule-based classifier ensemble is presented. A genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

A New Method for Extracting Named Entities in Classical Arabic

In Natural Language Processing (NLP) studies, developing resources and tools contributes to the extension and effectiveness of research in each language. In recent years, Arabic Named Entity Recognition (ANER) has attracted NLP researchers because of its significant impact on improving other NLP tasks such as machine translation, information retrieval, question answering, query result...

Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first laye...

Troika - An improved stacking schema for classification tasks

The idea of ensemble methodology is to build a predictive model by integrating multiple models. It is well-known that ensemble methods can be used for improving prediction performance. Researchers from various disciplines such as statistics, machine learning, pattern recognition, and data mining have considered the use of ensemble methodology. Stacking is a general ensemble method in which a nu...

Publication date: 1999
